Taxonomy Inference Using Kernel Dependence Measures
نویسندگان
چکیده
We introduce a family of unsupervised algorithms, numerical taxonomy clustering, to simultaneously cluster data, and to learn a taxonomy that encodes the relationship between the clusters. The algorithms work by maximizing the dependence between the taxonomy and the original data. The resulting taxonomy is a more informative visualization of complex data than simple clustering; in addition, taking into account the relations between different clusters is shown to substantially improve the quality of the clustering, when compared with state-of-the-art algorithms in the literature (both spectral clustering and a previous dependence maximization approach). We demonstrate our algorithm on image and text data.1
منابع مشابه
Learning causality by identifying common effects with kernel-based dependence measures
We describe a method for causal inference that measures the strength of statistical dependence by the Hilbert-Schmidt norm of kernelbased conditional cross-covariance operators. We consider the increase of the dependence of two variables X and Y by conditioning on a third variable Z as a hint for Z being a common effect of X and Y . Based on this assumption, we collect “votes” for hypothetical ...
متن کاملInformation Measures via Copula Functions
In applications of differential geometry to problems of parametric inference, the notion of divergence is often used to measure the separation between two parametric densities. Among them, in this paper, we will verify measures such as Kullback-Leibler information, J-divergence, Hellinger distance, -Divergence, … and so on. Properties and results related to distance between probability d...
متن کاملA Geometry Preserving Kernel over Riemannian Manifolds
Abstract- Kernel trick and projection to tangent spaces are two choices for linearizing the data points lying on Riemannian manifolds. These approaches are used to provide the prerequisites for applying standard machine learning methods on Riemannian manifolds. Classical kernels implicitly project data to high dimensional feature space without considering the intrinsic geometry of data points. ...
متن کاملKernel methods in computer vision: object localization, clustering, and taxonomy discovery
In this thesis we address three fundamental problems in computer vision using kernel methods. We first address the problem of object localization, which we frame as the problem of predicting a bounding box around an object of interest. We develop a framework in Chapter II for applying a branch and bound optimization strategy to efficiently and optimally detect a bounding box that maximizes obje...
متن کاملKernel Measures of Conditional Dependence
We propose a new measure of conditional dependence of random variables, based on normalized cross-covariance operators on reproducing kernel Hilbert spaces. Unlike previous kernel dependence measures, the proposed criterion does not depend on the choice of kernel in the limit of infinite data, for a wide class of kernels. At the same time, it has a straightforward empirical estimate with good c...
متن کامل